Abstract

Background: Sharing sets of chemical data (e.g., chemical properties, docking scores, etc.) among collaborators 
with diverse skill sets is a common task in computer-aided drug design and medicinal chemistry. The ability to 
associate this data with images of the relevant molecular structures greatly facilitates scientific communication. 
There is a need for a simple, free, open-source program that can automatically export aggregated reports of entire 
chemical data sets to files viewable on any computer, regardless of the operating system and without requiring the 
installation of additional software. 

Results: Here we present a program called WebChem Viewer that automatically generates these types of highly 
portable reports. Furthermore, in designing WebChem Viewer we have also created a useful online web 
application for remotely generating images of molecular structures from SMILES strings. We encourage the direct 
use of this online application as well as its incorporation into other software packages. 

Conclusions: With these features, WebChem Viewer enables interdisciplinary collaborations that require the 
sharing and visualization of small molecule structures and associated sets of heterogeneous chemical data. The 
program is released under the FreeBSD license and can be downloaded from 
http://nbcr.ucsd.edu/WebChemViewer. The associated web application (called "Smiley2png 1.0") can be accessed 
through freely available web services provided by the National Biomedical Computation Resource at 
http://nbcr.ucsd.edu. 

Background 

Biological and chemical projects often generate large 
amounts of chemical data, ranging from percent yields to 
docking scores to in vivo drug activities. Sharing these 
data sets effectively is a common task that is greatly en- 
hanced when images of the relevant molecular structures 
are incorporated into collaborative reports. There is a 
need for a simple, free, open-source program that can 
automatically generate data files viewable on any com- 
puter, regardless of the operating system and without re- 
quiring the installation of additional software. To this end, 
we have created a program called WebChem Viewer with 
unique capabilities currently lacking in similar software 
packages. The program can be downloaded from 
http://nbcr.ucsd.edu/WebChemViewer and is released under the 
FreeBSD license. 

By default, WebChem Viewer inserts images of molecu- 
lar structures into user-provided data sets using the Open 
Babel software package [1]. However, to simplify the user 
experience we also created an online web application (i.e., 
Opal service [2]) called "Smiley2png 1.0" that can generate 
these images remotely, thus eliminating the need for a 
local Open Babel installation. We encourage the use of 
this remote service independent of WebChem Viewer, 
both directly through its web interface and programmatic- 
ally as a component of other software packages. The 
service can be accessed through the National Biomedical 
Computation Resource's Web Services Opal Dashboard, 
which is linked directly from the NBCR homepage at 
http://nbcr.ucsd.edu. Tutorials describing how to use 
WebChem Viewer and Smiley2png can be found in the 
Supporting Information (Additional files 1 and 2). 

Implementation 

WebChem Viewer generates HTML-formatted output 
that can be viewed in any modern web browser without 
requiring the installation of additional software or plu- 
gins. Collaborators need only open the output file in their 
browsers to view the chemical data sets with associated 
molecular representations. The data is sortable by any 
column (Figure 1B) and is fully searchable (Figure 1C). 
Data columns can also be hidden/displayed as needed 
(Figure 1D). These features are provided by the jQuery 
[3] and DataTables [4] JavaScript libraries, which are 
released under the MIT and BSD licenses, respectively, 
as well as by custom JavaScript and HTML code 
created by the authors. The two-dimensional molecular 
images included in the output are programmatically 
generated from user-provided SMILES strings. 

To aid those generating these output files, we also kept 
the required dependencies of WebChem Viewer itself to 
a minimum. Strictly speaking, the program requires only 
a Python interpreter. Python comes installed by default 
on OS X and most Linux distributions. Simple-to-use 
installers are available for Windows as well. We recommend 
the free Anaconda Python distribution provided by 
Continuum Analytics, Inc. (http://continuum.io/downloads). 

WebChem Viewer's enhanced features may require 
some additional installations. For example, the program 
includes a graphical user interface (GUI, Figure 2) for 
those not comfortable using the command line. The GUI 
requires that Tkinter [5], a Python binding to the Tk GUI 
toolkit [6], be installed. Fortunately, as Tkinter is included 
in the standard Windows and OS X Python distributions 
as well as many Linux distributions, we expect the majority 
of users will have access to the GUI "out of the box." 
Further details can be found in the Results and Discussion 
section below. 

WebChem Viewer also uses Open Babel [1] and the 
accompanying Cairo 2D graphics library to generate 
two-dimensional molecular images from SMILES strings 
(Figure 1A). Most users will not need to install these 
programs on their machines either. WebChem Viewer 
automatically connects to a remote image-generating 
server (called "Smiley2png 1.0") if Open Babel is not 
available locally. If users are concerned about posting 
their data to a public server, or if they wish to generate 
more than the server's maximum permitted number of 
images (currently 200), they can install Open Babel on 
their own machines. Versions are available for all major 
operating systems. 

Figure 1 Sample WebChem-Viewer output. A) Two-dimensional representations of each molecule are provided by Open Babel or a remote 
server. B) The data set can be sorted by any column. C) The data is fully searchable. D) Data columns can be hidden/displayed as required. 

Figure 2 The WebChem-Viewer graphical user interface (GUI). The window is divided into four panels: 1. Load the Data (choose a 
tab- or comma-separated input file whose first line contains data labels); 2. Provide Further Information About Your Data (select 
the SMILES column, the sort column, and ascending or descending order); 3. Locate the OpenBabel Executable (optional; if obabel is 
not specified, a remote server is used, with some limitations); 4. Program Output (save as a single file with all dependencies 
embedded or as a directory containing multiple files, and select which data will be initially hidden). 
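
The choice between local and remote image generation can be sketched in Python as follows. This is a minimal illustration rather than WebChem Viewer's own code: the function names are hypothetical, the sketch targets Python 3 (the program itself was tested on Python 2.x), and the "-xp" image-size option is assumed from Open Babel's PNG writer and should be checked against the installed version.

```python
import shutil
import subprocess


def build_obabel_command(smiles, png_path, size_px=300, obabel="obabel"):
    """Return the argument list for rendering one SMILES string to a PNG."""
    # "-:<SMILES>" passes the molecule on the command line; "-O" names the output.
    # "-xp" (image size in pixels) is an assumed PNG writer option.
    return [obabel, "-:" + smiles, "-O", png_path, "-xp", str(size_px)]


def render_locally(smiles, png_path, size_px=300):
    """Render with a local obabel if one is on the PATH; return True on success."""
    exe = shutil.which("obabel")
    if exe is None:
        return False  # the caller would fall back to the remote service here
    cmd = build_obabel_command(smiles, png_path, size_px, obabel=exe)
    return subprocess.run(cmd, capture_output=True).returncode == 0
```

A caller would try render_locally first and contact the remote server only when it returns False, which mirrors the fallback order described in the text.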

Results and discussion 

A number of programs exist for sharing tabular lists of 
heterogeneous molecular data with associated structures. 
However, frequently molecular data sets must be shared 
with collaborators who don't have the required software 
installed on their computers; some packages are not 
available on all operating systems; many of the relevant 
tools are prohibitively expensive; and many programs, 
feature rich by design, are excessively complex for sim- 
ple data sharing. In order to address these challenges, 
we have created a program called WebChem Viewer 
capable of organizing molecular data sets in a visual 
and intuitive way. 



WebChem Viewer input 

WebChem Viewer accepts two types of molecular-data 
tabular input files. First, the user can specify a file where 
each data point is separated by a comma, with comma- 
containing entries placed in quotes (Table 1). This is the 
standard comma-separated-values (CSV) format used 
by many programs such as Microsoft Excel. Second, 
the user can specify a file where each data entry is separated 
by a tab character (i.e., tab delimited; Table 1). We've 
found that we often generate tab-delimited files when 
using the Unix paste command. 

Regardless of the specific format used, the first row 
of the input file must contain column labels, and sub- 
sequent rows must contain data listed in the same 
order. Each row represents a single molecule; row en- 
tries might include any molecule-associated data, such 
as the molecular name, weight, SMILES string, or a 
docking score. 
Table 1 Examples of the program input files 

Format: CSV (Excel) 
    ID Number, SMILES, Docking, Source, Synthesis Method 
    PLZKL-OMM-712, c1ccccc1, -5.2, "Pilzukil, Pharmaceuticals", Click Chemistry 
    PLZKL-OMM-677, CCOC, -9.3, Owen Moore Monet Pharma, Natural Product 

Format: Tab delimited 
    ID Number    SMILES    Docking    Source    Synthesis Method 
    PLZKL-OMM-712    c1ccccc1    -5.2    Pilzukil Pharmaceuticals    Click Chemistry 
    PLZKL-OMM-677    CCOC    -9.3    Owen Moore Monet Pharm    Natural Product 

Both comma-separated values (CSV) and tab-delimited data are accepted. The first row must contain data labels. Subsequent rows contain 
the data associated with each molecule. 
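
As an illustration of how both input formats can be read, the following Python 3 sketch parses delimited text with the standard csv module. It is not taken from WebChem Viewer; the function name read_table is hypothetical, and the sample rows are modelled on Table 1 with illustrative SMILES strings.

```python
import csv
import io


def read_table(text, tab_separated=False):
    """Parse delimited text into (labels, rows); rows are dicts keyed by label."""
    delimiter = "\t" if tab_separated else ","
    # skipinitialspace lets a quoted entry follow ", " as in the CSV sample.
    reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                        skipinitialspace=True)
    lines = [row for row in reader if row]  # ignore blank lines
    labels, data = lines[0], lines[1:]      # the first row holds the labels
    return labels, [dict(zip(labels, row)) for row in data]


csv_text = (
    'ID Number, SMILES, Docking, Source, Synthesis Method\n'
    'PLZKL-OMM-712, c1ccccc1, -5.2, "Pilzukil, Pharmaceuticals", Click Chemistry\n'
    'PLZKL-OMM-677, CCOC, -9.3, Owen Moore Monet Pharma, Natural Product\n'
)
labels, rows = read_table(csv_text)
print(rows[0]["Source"])  # the quoted, comma-containing entry stays in one field
```

Passing tab_separated=True handles the second format; quoting is then unnecessary because commas no longer act as separators.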



Next, the user provides WebChem Viewer with infor- 
mation about the input data set. First, the user indicates 
which of the data columns contains the required SMILES 
strings, from which two-dimensional representations of 
each molecule are automatically generated. The user next 
indicates the column to use to initially sort the data, either 
in ascending or descending order. For example, we often 
wish to communicate the results of our virtual screens 
with collaborators. The docking scores associated with 
each molecule attempt to predict the free energy of bind- 
ing; consequently, more negative scores represent better 
candidate ligands. A reasonable initial sorting, then, might 
be to order the data by the docking score, ascending from 
the lowest (most negative) value to the highest (least 
negative). 
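
The initial ordering described above can be sketched as follows. This is an illustrative Python 3 example with hypothetical names, not the program's own sorting code. The values are compared as numbers because a plain text comparison would place "-5.2" before "-9.3".

```python
def sort_rows(rows, column, descending=False):
    """Sort row dicts by one column, numerically when every value parses."""
    try:
        keyed = [(float(row[column]), row) for row in rows]
    except ValueError:
        keyed = [(row[column], row) for row in rows]  # fall back to text order
    keyed.sort(key=lambda pair: pair[0], reverse=descending)
    return [row for _, row in keyed]


rows = [
    {"ID Number": "PLZKL-OMM-712", "Docking": "-5.2"},
    {"ID Number": "PLZKL-OMM-677", "Docking": "-9.3"},
]
best_first = sort_rows(rows, "Docking")  # ascending: most negative score first
print([row["ID Number"] for row in best_first])
```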

Finally, the user can specify which columns to initially 
hide. In our experience, it is often helpful to include 
supplementary information about each molecule that is 
useful but not critical for understanding. This data can 
be initially hidden and subsequently viewed by collabo- 
rators only when specifically requested. For example, in 
presenting the results of a virtual screen, the molecule 
name, structure, and docking score are clearly para- 
mount. Associated data like the number of Lipinski vio- 
lations [7] are useful but not necessarily critical, and so 
might be initially hidden. 

WebChem Viewer output/portability 

WebChem Viewer produces output that is HTML for- 
matted. HTML is the same format used to create inter- 
net web pages; consequently, the output can be viewed 
in any modern web browser, on any computer operating 
system (including mobile devices). In our experience, 
this degree of portability is critical for many projects. 
For example, computational chemists almost exclusively 
use Unix-based operating systems, while most 
other researchers use Windows. Those research groups 
that have their own systems for sharing sets of molecular 
data (and many do not) typically resort to programs 
that are expensive, excessively complex, or operating-system 
dependent. It is not always practical to tell collaborators 
that they need to buy PerkinElmer's ChemDraw or 
download Schrödinger's Maestro, especially given that 
these packages include many features not required for 
simple data sharing. By divorcing data sharing from any 
specific operating system or computer program and instead 
wedding it to the ubiquitous web browser, we ensure that 
collaborators need only click an HTML file sent by email in 
order to visualize the data immediately. 

These challenges are hardly unique to scientific shar- 
ing. They are the very issues driving the current broader 
interest in "cloud computing" (i.e., deploying desktop or 
mobile apps through the internet rather than through 
the operating system). Given the recent proliferation of 
available operating systems (Android, iOS, Mac OSX, 
Windows, Linux, etc.) and the potentially imminent 
ascent of a number of others (e.g., Chrome and Firefox 
OS), operating-system independence is more critical 
than ever. 

Given that WebChem Viewer's output is so inherently 
portable, it is ideally suited for incorporation into existing 
cloud-based chemistry applications. As we expect 
most users will use WebChem Viewer to simply share 
data sets with colleagues, by default the program generates 
a single HTML file with all the required dependencies 
(JavaScript, CSS style sheets, and image files) embedded 
directly into the code. However, WebChem Viewer can also 
save its HTML and dependencies as separate files so that 
the relevant portions of the output can be more easily 
incorporated into existing web-app frameworks. Indeed, in 
collaboration with the National Biomedical Computation 
Resource, we are currently pursuing plans to incorporate 
WebChem Viewer into a number of online chemistry ap- 
plications. We are hopeful that other organizations will 
similarly find that WebChem Viewer satisfies their cloud- 
based project needs. 
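
One common way to place image files inside a single HTML file is the base64 data URI. The Python 3 sketch below illustrates that general technique; whether WebChem Viewer embeds its images in exactly this way is an assumption, and the function names are hypothetical.

```python
import base64
import html


def embed_png(png_bytes):
    """Return an <img> tag carrying the PNG inline as a base64 data URI."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return '<img src="data:image/png;base64,%s">' % encoded


def table_row(image_tag, values):
    """Build one HTML table row: structure image first, then escaped data cells."""
    cells = ["<td>%s</td>" % image_tag]
    cells += ["<td>%s</td>" % html.escape(str(value)) for value in values]
    return "<tr>%s</tr>" % "".join(cells)
```

Because the image bytes travel inside the markup, the resulting file can be emailed on its own; the multiple-file option would instead write each PNG to disk and reference it by a relative path.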

Smiley2png 1.0 Opal service 

If required, users and software developers are also invited 
to access the remote Smiley2png 1.0 Opal service 
that permits WebChem Viewer to generate images 
of molecular structures "in the cloud," either directly 
through the National Biomedical Computation Resource's 
Opal Dashboard or programmatically through an Opal 
client (https://sourceforge.net/projects/opaltoolkit/files/opal-python/). 
The web interface accepts three parameters. First, 
the user must provide a text file, where each line represents 
a single compound and includes the name of the image file 
to generate as well as the associated SMILES string, separated 
by a space. Additionally, the user can specify the size 
of the square PNG image to generate, in pixels, as well as an 
email address for job-completion notification. 
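
The text-file parameter described above could be prepared with a short helper such as the following Python 3 sketch. The helper name is hypothetical, and placing the image file name before the SMILES string on each line is an assumption drawn from the order in which the two fields are mentioned; the limit of 200 images is the server maximum quoted in the Implementation section.

```python
MAX_IMAGES = 200  # server limit quoted in the text ("currently 200")


def smiley2png_lines(compounds):
    """Format (image_name, smiles) pairs as one 'name SMILES' line per compound."""
    if len(compounds) > MAX_IMAGES:
        raise ValueError("too many compounds for one remote job")
    lines = []
    for image_name, smiles in compounds:
        # A space separates the two fields, so neither field may contain one.
        if " " in image_name or " " in smiles:
            raise ValueError("fields must not contain spaces")
        lines.append("%s %s" % (image_name, smiles))
    return "\n".join(lines) + "\n"
```

The returned string can be written to a file and supplied to the web interface or to an Opal client.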

Comparing WebChem Viewer to existing software 
packages 

WebChem Viewer has a number of advantages over 
existing packages. 1) It is simple and easy to use because 
it is singularly dedicated to generating easy-to-interpret 
reports. The user need only provide the data set, specify 
a few output options, and click the "Run" button to generate 
highly portable output files. 2) It is free for all 
users, including researchers working in industry, and is 
entirely open source so that knowledgeable users can 
modify the program according to their needs. 3) The 
program itself requires only a Python interpreter and so 
can run on all modern desktop operating systems. 4) 
The output can be viewed in any web browser, including 
on mobile devices, without the need for additional 
plugins/programs. 5) The output is HTML formatted 
and so can be easily incorporated into existing websites 
to allow even broader data sharing if required. 

Table 2 demonstrates how WebChem Viewer compares 
to Schrödinger's Maestro Suite [8], PerkinElmer's 
ChemDraw [9], and ChemAxon's Instant JChem, three 
popular chemical-database management packages. Like 
WebChem Viewer, all three of these programs have graphical 
user interfaces that allow the user to generate tabular 
chemical reports that include both structural images and 
associated molecular information. However, WebChem 
Viewer has a number of advantages that are worthy of 
mention. 

Schrödinger's closed-source Maestro Suite, the "unified 
interface for all Schrödinger software," includes many 
useful tools that facilitate computer-aided drug discovery 
and computational biology generally. However, 
when one wants only to generate a simple report to 
share with collaborators, these additional features adversely 
impact simplicity and usability. Maestro's complexity 
aside, Schrödinger's free version is capable of 
producing tabular reports that include images of chemical 
structures. If collaborators are willing to download Maestro 
as well (1.4 GB as of 11/2013), they can search and 
sort any Maestro-formatted molecular data shared with 
them. Maestro runs under Linux, Windows, and older 
versions of OSX, but is not currently compatible with 
OSX 10.9; furthermore, Maestro-formatted data files cannot 
be viewed on mobile devices or easily incorporated 
into existing web pages. Given WebChem Viewer's simplicity 
and portability, we believe it is better suited for the 
singular task of generating simple compound-library 
reports. 

PerkinElmer's ChemDraw is a chemical drawing program 
with myriad tools for conversion, enumeration, 
querying, etc. These features, while useful in many contexts, 
again adversely impact simplicity and usability 
when one wishes only to generate a simple report. Furthermore, 
ChemDraw is closed source and costly (commercial 
licenses range from $540 to $1,540) and can 
only organize associated data in a tabular format via a 
Microsoft Excel plugin. Neither ChemDraw nor the Excel 
plugin runs under Linux, and the plugin is incompatible 
with OSX as well. Finally, there is no easy mechanism 
for incorporating the output into existing web pages 
for broad data sharing. 

Finally, ChemAxon's Instant JChem, like WebChem 
Viewer, is singularly focused on displaying and organizing 
the contents of molecular databases. Unlike WebChem 
Viewer, however, Instant JChem is closed source 
and expensive ($420-$1,610 depending on the license), 
though ChemAxon does provide a free version for academics 
and a free viewer for all researchers. Instant 
JChem can also export molecular data sets to PDF and 
Microsoft Excel files for collaborators who don't wish to 
download ChemAxon's software. Like WebChem Viewer, 
Instant JChem runs on all modern desktop operating systems, 
and its PDF-formatted output is highly portable. 

Table 2 Software comparison 

WebChem Viewer 
    Simplicity/usability: A single program dedicated only to generating reports. 
    Programs required to view the output: Any modern web browser 
    Cost/open source: Free for everyone. Open source. 
    Operating systems: Linux, OSX, Windows 

Maestro 
    Simplicity/usability: "The unified interface for all Schrödinger software," with advanced tools to support 
    molecular modeling, drug discovery, etc. 
    Programs required to view the output: Maestro 
    Cost/open source: Free version available. Closed source. 
    Operating systems: Linux, older versions of OSX, Windows 

ChemDraw with Microsoft Excel 
    Simplicity/usability: A full-featured chemical drawing program with myriad tools for conversion, querying, 
    enumerating, etc. 
    Programs required to view the output: ChemDraw, Microsoft Excel 
    Cost/open source: Commercial license: $540-$1,540. Closed source. 
    Operating systems: Windows only 

Instant JChem 
    Simplicity/usability: Straightforward interface for managing molecular databases. 
    Programs required to view the output: Instant JChem, Adobe Acrobat Reader, Microsoft Excel 
    Cost/open source: Free for academics, $420-$1,610 otherwise. Closed source. 
    Operating systems: Linux, OSX, Windows 

WebChem Viewer compared to Schrödinger's Maestro, PerkinElmer's ChemDraw, and ChemAxon's Instant JChem. 

Furthermore, recognizing the modern importance of 
being able to share molecular data over the web, ChemAxon 
has also developed a version of Instant JChem 
that runs through Java Web Start. While the Web Start version 
has its utility, WebChem Viewer's HTML-formatted 
output is even better suited for web sharing. Due to recent 
Java security vulnerabilities, many users cannot run Java applets 
in their web browsers. For example, an analysis of user 
data collected from 17,514 people who visited the author's 
personal website over the course of a recent month suggests 
that only 69% had browsers with Java enabled. In contrast, 
WebChem Viewer's output does not require a 
Java installation. 

Distributed drug discovery: a test case 

To verify WebChem Viewer's utility in a real-life test 
situation, we recently used the program to further collaborations 
with the Distributed Drug Discovery (D3) 
initiative [10-12]. D3 is an educational initiative that al- 
lows undergraduate students to generate and test chem- 
ical compounds that could one day be developed into 
new drugs. Our efforts have focused on using computer- 
aided drug-design techniques to guide future student 
synthesis. In this context, we've needed to share large 
amounts of chemical data with D3 collaborators. 

WebChem Viewer has greatly facilitated D3-based 
efforts. Our collaborators have specifically commented 
on the utility of the sorting, searching, and column-hiding 
features. Additionally, because WebChem-Viewer 
output is HTML formatted, we have been able to 
modify its appearance according to our collaborators' 
requests. 

In the past, we used Microsoft Excel to tabulate our 
data when sharing with collaborators. As our particular 
operating system is not compatible with ChemDraw's 
Excel plugin, we were forced to manually convert individual 
SMILES strings to images on a compound-by-compound 
basis and to tediously copy the resulting images from 
ChemDraw into Excel. As good collaborations often 
involve back-and-forth feedback, a given project fre- 
quently required us to repeat this process many times 
as we modified our computational protocols in re- 
sponse to collaborators' suggestions. 

In contrast, with WebChem Viewer the D3 collabor- 
ation has been streamlined. Structural images are incor- 
porated into the reports automatically, thus lowering the 
barrier required to implement new suggestions. These 
benefits were obtained without requiring our collabora- 
tors, who are not computationalists, to install additional 
software. 



Table 3 WebChem Viewer operating-system compatibility 

Operating system           Open Babel version   Python version 
Scientific Linux 6.2       2.3.1                2.6.6 
Mac OSX 10.8.3             2.3.1                2.7.2 
Windows XP Professional    2.3.2                2.5 
Windows XP Professional    2.3.2                2.6 
Windows XP Professional    2.3.2                2.7.3 

WebChem Viewer has been tested on a number of operating systems, with 
various Open Babel and Python versions. We note that most users will not 
need to download and install Open Babel on their own machines. 



Stability 

To test the stability of WebChem Viewer, we first ob- 
tained a list of 162,161 SMILES strings by downloading 
and processing the "Clean Fragments" subset of the 
ZINC database [13]. We then generated 200 tabular in- 
put files by randomly selecting 15 SMILES strings per 
file and associating 5 to 10 dummy variables with each 
compound. These dummy variables consisted of randomly 
chosen numbers ranging from 0 to 100 and/or randomly 
generated text sequences of 10 letters. WebChem Viewer 
processed the first 100 input files using a local copy of 
Open Babel to generate molecular-structure images. 
The last 100 input files were similarly processed, ex- 
cept the remote image-generating server ("Smiley2png") 
was employed. In all cases, WebChem Viewer produced 
the appropriate output files without any errors. 
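
A test-input generator of the kind described above might look like the following Python 3 sketch. It is not the authors' script: the function name and column labels are hypothetical, the output is assumed to be tab delimited, and the dummy numbers are assumed to be integers.

```python
import random
import string


def make_test_table(smiles_pool, n_rows=15, rng=random):
    """Build one tab-delimited test table with 5 to 10 dummy columns."""
    n_dummy = rng.randint(5, 10)
    # Each dummy column holds either numbers (0 to 100) or 10-letter text.
    kinds = [rng.choice(["number", "text"]) for _ in range(n_dummy)]
    header = ["SMILES"] + ["Dummy %d" % (i + 1) for i in range(n_dummy)]
    lines = ["\t".join(header)]
    for smiles in rng.sample(smiles_pool, n_rows):
        row = [smiles]
        for kind in kinds:
            if kind == "number":
                row.append(str(rng.randint(0, 100)))
            else:
                letters = [rng.choice(string.ascii_letters) for _ in range(10)]
                row.append("".join(letters))
        lines.append("\t".join(row))
    return "\n".join(lines) + "\n"
```

Calling the function 200 times on the full SMILES list, and writing each result to its own file, would reproduce the shape of the test set; passing a seeded random.Random instance as rng makes a run repeatable.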

Conclusions 

WebChem Viewer provides a simple and free way to 
share substantial quantities of heterogeneous chemical 
data. The program has been specifically tested on a 
number of operating systems, using several different versions 
of Open Babel and Python (Table 3). Additionally, 
WebChem-Viewer output has been successfully visualized 
in a number of web browsers (Table 4). Sample data sets 
in both the CSV and tab-delimited formats are provided 
with the download so that interested users can easily experiment 
with the program. Tutorials are also included in 
the Supporting Information (Additional files 1 and 2). We 
are hopeful that both WebChem Viewer and its associated 
web application for generating images of molecular 
structures will be useful tools for computational and 
medicinal chemists, as well as their collaborators. 

Table 4 WebChem Viewer browser compatibility 

Operating system           Web browser 
Scientific Linux 6.2       Chrome 26.0.1410.63 
Scientific Linux 6.2       Firefox ESR 17.0.5 
Mac OSX 10.8.3             Chrome 29.0.1547.32 beta 
Mac OSX 10.8.3             Firefox 22.0 
Mac OSX 10.8.3             Safari 6.0.4 
Android 4.1.2 (Tablet)     Chrome 28 
Android 4.1.2 (Tablet)     Default Android Browser (as of 8/2013) 
iOS 6.1.3 (Tablet)         Mobile Safari 6 
Windows XP Professional    Firefox 3.0.6 
Windows XP Professional    Chrome 28.0.1500.95 m 
Windows XP Professional    Internet Explorer 8.0.6001.18702 
Windows 7                  Internet Explorer 10.0.9200.16635 

WebChem Viewer's HTML output files have been successfully visualized on 
many web browsers, running under several operating systems. 

Availability and requirements 

Project name: WebChem Viewer 
Project home page: http://nbcr.ucsd.edu/WebChemViewer 
Operating systems: Platform independent 
Programming language: Python, HTML, JavaScript 
Other requirements: Python 2.x (tested on versions 2.5 
and higher), Open Babel (optional if the user wants to 
generate molecular images locally rather than using our 
server application; tested on versions 2.3.1 and 2.3.2) 
License: FreeBSD license 

Any restrictions to use by non-academics: None 

Additional files 



Additional file 1: WebChem Viewer Tutorial. 
Additional file 2: Smiley2png 1.0 Tutorial. 